ABSTRACT


This document describes two problem areas concerning systems
subject to periodic checkout, both of which were investigated
with the aid of a computer. The first part (Parameter
Estimation) summarizes the results of a Monte Carlo analysis
to determine the feasibility and accuracy of measuring
system failure rates, checkout error probabilities, and
repair effectiveness from the numbers of systems passing
and failing in three consecutive checkouts. The second part
(Availability Analysis) describes, mainly through a series
of graphs, the quantitative variation of system availability
with a number of operational and maintenance parameters:
the durations of standby, checkout, and repair; failure rates
during standby and checkout; repair effectiveness; decision
errors during checkout; and test coverage.


PREFACE  TO  THE  REVISED  PRINTING 


This edition corrects the equation for $P_{AR}$ given in Figure 4,
page 28. Figures B-1 to B-46, in Appendix B, have been revised
to correspond to the corrected equation. The changes in the
figures are, for the most part, relatively minor, and none of the
conclusions of the report require modification. (Some of the
changes in the figures are only apparent, due to a change in scale
of the abscissa.)

A  typographical  error  in  the  equation  for  (PFP)  on  page  12 
has also been corrected (thanks to H. Jaffe).

I. INTRODUCTION AND SUMMARY


A.  Introduction 

One of the major factors contributing to the effectiveness of the
Atlas ballistic missile weapon system is availability, or alert
readiness. Mathematical models relating this factor to the hardware,
procedures, and personnel aspects of the system have been developed
and are described in References 1, 2 and 5. Concurrently with the
development  of  these  models,  there  has  been  an  investigation  into 
the  question  of  how  to  estimate  or  measure  the  parameters  used  in 
the  availability  equations.  This  question  was  first  considered  in 
detail  in  Reference  3,  which  describes  two  possible  methods  of 
parameter  estimation  for  periodically  checked  systems.  Another 
method  is  described  in  Reference  6.  Parameter  estimation  for 
continuously  monitored  systems  is  considered  in  Reference  4. 

The significance of the parameter estimation problem for the alert
readiness models arises from the following conditions:

(1)  The  models  include  the  possible  effects  of  checkout  error, 
incomplete  test  coverage,  and  imperfect  repair;  these  effects 
cannot  be  measured  directly  by  means  of  routine  failure  and 
maintenance  data,  as  the  data  itself  includes  these  errors  to  an 
unknown  extent. 

(2) Although it is theoretically possible to initiate a comprehensive
program of failure analysis and operational surveillance to provide
the kind of data required, no data of this sort are presently
available, nor is such a program contemplated. Moreover,
the cost of such a program could well be prohibitive.

For these reasons, the possibility of using existing or easily
obtainable data to estimate the model parameters by inferential or
indirect methods has been analyzed in considerable detail. Reference 3
describes two basic methods which could be used for indirect parameter
estimation. In order to check the accuracy and potential usefulness
of these methods, a series of Monte Carlo trials was run on the
7090 computer. The trials were conducted using various input
parameter  values  and  sample  sizes,  and  estimated  parameter  values 
were  automatically  computed  using  the  estimating  equations,  in  the 
same  manner  as  would  be  the  case  if  actual  field  data  were  available. 
Comparison  of  the  estimated  with  the  true  parameter  values  allows 
an  assessment  of  the  effectiveness  of  the  estimating  procedures. 

Section  II  discusses  the  two  estimation  methods,  the  corresponding 
computer  programs,  and  the  results  obtained  to  date. 

When  the  attempt  is  made  to  account  for  most  of  the  potentially 
important  factors  affecting  alert  readiness,  the  resulting  mathemati¬ 
cal  models  yield  complicated  equations  for  availability,  with  numerous 
parameters.  A  general  expression  for  the  alert  readiness  of  a  peri¬ 
odically  checked  system  is  given  in  Reference  2  (Equation  54,  p.  10). 
This  equation  includes  all  parameters  used  in  the  mathematical 
models  to  date  to  characterize  periodically  checked  systems.  The 
resulting  equation  is  sufficiently  complex  that  a  computer  program 
was  developed  to  investigate  the  effects  of  the  various  parameters  on 
system  availability.  To  simplify  presentation  of  results,  a  plotter 
was  connected  to  the  computer  output.  Section  III  discusses  the  equa¬ 
tion  and  presents  the  numerical  results  obtained. 

Because  of  the  volume  of  data  summarized  for  the  Monte  Carlo 
trials  and  the  computer  analysis  of  availability,  the  tables  and  graphs 
have  been  placed  in  Appendices  A  and  B,  respectively.  To  aid  in 
interpreting  the  data,  as  well  as  to  aid  in  reading  the  report,  a  fold- 
out  list  of  symbol  definitions  used  in  the  Monte  Carlo  trials  is  provided 
at  the  end  of  Appendix  A,  and  a  similar  fold-out  list  of  symbols  used 
in the availability analysis appears at the end of Appendix B. There is
some difference in notation between the two lists, as the computer
programs originated from different projects, and as it is believed
desirable to show the symbols as they actually appear in the existing
programs.
For  the  reader  interested  only  in  the  main  results  of  the  com¬ 
puter  analyses,  a  summary  follows. 

B.  Summary 

This  document  reports  the  results  of  two  projects  pertaining  to 
system  availability  which  required  extensive  numerical  analysis. 


The first project involved a series of Monte Carlo runs simulating the
time sequence of events for individual units passing through repair,
standby, and checkout. The objective was to determine whether, and to
what degree of accuracy, the system failure rate, repair
effectiveness, and error frequency during checkout could be estimated
from the checkout results alone. The second project was a parametric
analysis of system availability, using a curve-plotting machine
connected to the computer output to help reveal the sensitivity
of availability to the various system operational and maintenance
parameters. Parameters included in the analysis were standby,
checkout, and repair periods, system failure rate during standby and
checkout, repair effectiveness, test coverage, and decision errors
during checkout.

The  major  results  of  these  projects  are  presented  in  tabular 
and  graphical  form  in  the  appendices.  Appendix  A  is  a  series  of 
tables  describing  the  parameter  estimates  from  the  Monte  Carlo  runs. 
Appendix  B  is  a  representative  selection  of  graphs  obtained  in  the 
parametric  study  of  availability.  While  the  tables  and  graphs  speak 
for  themselves,  some  of  the  more  obvious  results  are  summarized 
below. 

1.  Parameter  Estimation 

Of  the  two  methods  of  indirect  parameter  estimation  for 
periodically  monitored  systems  described  in  Reference  3,  the 
Variable  Standby  Time  Method  and  the  Multiple  Checkout  Method, 
only  the  latter  method  was  subjected  to  a  relatively  thorough 
numerical  analysis  via  Monte  Carlo  runs  on  a  computer.  A  com¬ 
puter  search  routine,  described  in  Reference  7,  was  developed  for 
processing  data  applicable  to  the  Variable  Standby  Time  Method, 
but  a  thorough  numerical  test  was  not  performed  for  this  method. 
Some  small-scale  sample  results  are  reported  in  Reference  7. 

The  numerical  results  described  in  this  report,  and  summarized 
here,  apply  therefore  strictly  to  the  Multiple  Checkout  Method. 

Briefly, this method prescribes that a group of systems enter
a standby period, followed by three consecutive checkouts.
Whether the units are operating or not during standby is immaterial,
but the standby mode (which may be of zero length) is assumed
the same for all units. Based exclusively on the number of units
passing or failing on each of the three checkouts, the problem
is to determine, if possible, the values of the unit survival
probabilities during standby ($p_s$) and checkout ($p_{c_1}$ and
$p_{c_2}$), the probability the unit was good upon completing repair
($p$), and the probabilities of calling a good system bad ($\alpha$)
or a bad system good ($1-E$) at the point of checkout decision. It was
shown in Reference 3 that, in general, the parameters $\alpha$ and $E$
can be estimated, together with the products $p_{c_1} p_{c_2}$ and
$p\,p_s\,p_{c_1}$. It was also shown that if some of the units go
directly from repair into checkout, so that $p_s = 1$, and if
furthermore all failures during checkout occur prior to test decision,
so that $p_{c_2} = 1$, all of the remaining parameters can be
estimated. (Since the product $p_{c_1} p_{c_2}$ is estimated, it is
necessary only that there be an assumed relationship between $p_{c_1}$
and $p_{c_2}$, rather than $p_{c_2} = 1$ specifically, for it to be
possible to estimate all of the parameters.)


The parameters $\alpha$ and $1-E$ can be called "error parameters,"
since they describe Type I and Type II decision errors during
checkout, and describe the apparent, rather than the actual,
condition of the units. The other parameters can be called "state
parameters," since they describe the actual condition. The Monte
Carlo runs show that the accuracy of measuring the error parameters
$\alpha$ and $E$ depends strongly upon their values as well as the
values of the other (state) parameters. The number of units (or
sample size) required to form reasonable estimates varies from
about 25 or 50 for $E$, and 100 for $\alpha$, for the most favorable
cases, to 10,000 or more for less favorable combinations of parameter
values. There is no apparent bias in the estimates.


If only one source of error is present, the accuracy of measurement
of that error increases significantly. For example,
with a sample size of 500 and nominal values of the state parameters
($p_s = 0.8$, $p = 1.0$, $p_{c_1} = 0.8$, $p_{c_2} = 1.0$), and with
a true value of $E = 0.9$, the variance of the estimate of $E$
increases from 0.0027 for $\alpha = 0$ to 0.1718 for $\alpha = 0.3$,
or a change in the standard deviation of estimate of almost a factor
of 10. Similarly, with the same values of the other parameters, if
true $\alpha = 0.1$, the variance of the estimate of $\alpha$ is
0.0039 if $E = 1$ and 0.4182 if $E = 0.7$. These
results are from 300 sample runs with 500 units simulated on
each run.

The  accuracy  of  estimation  degrades  rapidly  if  there  are 
no  true  failures  during  checkout,  as  the  method  of  estimation 
relies  on  there  being  a  possibility  of  change  of  state  of  the  units 
as  they  pass  through  the  three  consecutive  checkouts.  If  no 
change  can  occur,  the  estimating  equations  degenerate,  and, 
as with a single checkout, no estimates can be obtained for
$\alpha$ and $E$ unless additional information is available or
additional assumptions are made.

When a series of cases was run which allowed all parameters to be
estimated, the availability calculated from these estimates agreed
reasonably well with the true availability. These cases used a sample
size of 1000, with the availability computed by hand from the program
output; the use of smaller sample sizes should be investigated on the
computer.

It was found by comparison with a sample result that confidence
limits obtained on the estimates by using the normal
distribution agreed closely with the percentage groupings printed
out by the computer. This method is satisfactory provided the
variance is not too large.

2.  Availability  Analysis 

The availability equation for a system subject to periodic
checkout which is discussed in this report was derived in
Reference 1, and is based on the assumptions that failures during
standby have an exponential distribution, and that standby, checkout,
and repair time (if required) are of fixed duration. The parameters
which this equation quantitatively relates to system availability
can be grouped as follows in order to summarize:

(1) Duration of standby, checkout and repair periods
($T_s$, $T_c$, $T_r$)

(2) System failure parameters (for detectable failures only) during
standby and checkout ($\lambda_s$, $p_{dt_{c_1}}$, $p_{dt_{c_2}}$)

(3) Error probabilities in checkout in deciding whether a system
is good or bad on the basis of detectable characteristics
($\alpha$, $\beta$)

(4) Partial test coverage parameters
($\lambda_u$, $p_{ut_c}$, $1-p_1-p_2$)

(5) Imperfect repair parameter for detectable failures ($p_1$).

Definitions of these parameters will be found on a fold-out sheet
at the end of Appendix B.

A series of basic cases was run in which the independent
variable was standby time ($T_s$), because of its importance as a
parameter which is specified by maintenance policy and which
can therefore be relatively easily modified as necessary. As
is generally known, availability (or probability of alert
readiness) does not usually increase or decrease monotonically
with $T_s$. Under a given set of conditions, there is some value
of $T_s$ which maximizes $P_{AR}$, and this value can be considered as
a partial specification of a maintenance policy.

Aside from $T_s$, of the above list of variables, only two, $\alpha$
and $p_{dt_{c_1}}$, were found to have the property that $P_{AR}$ did
not always increase or decrease as the parameter varied with the
other parameters held fixed. Specifically, $P_{AR}$ decreases
monotonically as the following increase: $T_c$, $T_r$, $\lambda_s$,
$1-p_{dt_{c_2}}$, $\lambda_u$, $1-p_{ut_c}$, $1-p_1$. The manner of
variation and the interactions are too complex to summarize; however,
the following trends can be noted. Certain parameters have an
important influence on $P_{AR}$ if the standby time is short, but less
if the standby time is long. These are: $T_c$, $T_r$, $p_{ut_c}$,
$1-p_1-p_2$, and $p_{dt_c}$ (the final subscript index of this last
symbol is illegible in the source). Other parameters are more
important at longer standby times than at short. These are
$\lambda_s$ and $\lambda_u$. (However, even at short standby
times $\lambda_u$ has considerable effect.) A third group of
parameters affects $P_{AR}$ in a manner which is less dependent on
standby time (but the effect is less at short times). These are
$\beta$ and $p_1$. For large values of $T_s$, $P_{AR}$ has
approximately a linear relationship with all parameters except
$\lambda_s$ and $\lambda_u$; for these parameters, the relationship is
approximately linear for small $T_s$. $P_{AR}$ is approximately linear
with $p_1$ and $1-p_1-p_2$ for all values of $T_s$.

The optimum standby time was found to increase as the
following increase: $T_c$, $T_r$, $\lambda_s$, $\lambda_u$, $\beta$,
$1-p_{dt_{c_1}}$, $1-p_{dt_{c_2}}$, and $1-p_{ut_c}$.
It usually increases as $\alpha$ increases. It remains essentially
unchanged as $p_1$ and $1-p_1-p_2$ vary.

The parameters $\alpha$ and $p_{dt_{c_1}}$ are particularly
interesting since the maximum value of $P_{AR}$ may occur at either
extreme (0 or 1, as these are probabilities) or at intermediate
values, depending on the other conditions. (This property of $\alpha$
was pointed out in Reference 8.) While the parameters $\alpha$ and
$p_{dt_{c_1}}$ were incorporated into the model to represent sources
of degradation in the form of Type I errors and early failures during
checkout, respectively, they were found to operate also as
"preventive maintenance" parameters. Due to the possible
accumulation, with time, of undetectable failures in the system, it is
best to repair/replace so-called "good" systems periodically. Both
$\alpha$ and $p_{dt_{c_1}}$ can operate to force this result: in the
case of $\alpha$, "good" systems are sent to repair through error, and
in the case of $p_{dt_{c_1}}$, they are sent to repair through
deliberate failure just prior to the point of test decision. Whether
this preventive repair/replacement is performed at every checkout,
after a certain maximum number of checkouts, or never, depends on
whether the optimum value of $\alpha$ (or of $1-p_{dt_{c_1}}$) is 1,
between 0 and 1, or 0, respectively. These remarks apply to
undetectable failures occurring during standby and checkout
($\lambda_u$ and $p_{ut_c}$), but not to those introduced during the
repair process itself ($1-p_1-p_2$). If undetectable failures occur
only during repair/replacement, this type of preventive maintenance is
valueless. In addition to the values of $\lambda_u$ and $p_{ut_c}$,
the size of $T_r$ influences the optimum replacement period, as this
is a compensating time lost from readiness. Thus, the relationship
between $P_{AR}$ and $\alpha$, or between $P_{AR}$ and
$p_{dt_{c_1}}$, can change slope depending on the values of
$\lambda_u$, $p_{ut_c}$, and $T_r$.

II. PARAMETER ESTIMATION FOR SYSTEMS SUBJECT
TO PERIODIC CHECKOUT

A.  Statement  of  the  Problem 

A  system  which  undergoes  a  normal  cycle  of  standby  and  check¬ 
out  presents  a  number  of  theoretical  and  practical  difficulties  when 
the  attempt  is  made  to  describe  or  predict  the  behavior,  and  in  par¬ 
ticular  the  availability,  of  the  system.  It  is  theoretically  straight¬ 
forward  to  specify  an  optimum  period  between  checkouts,  based  on 
assumed  failure  rates  during  standby  and  checkout,  but  it  is  not  so 
apparent  how  these  rates  are  to  be  obtained.  Whether  failures  "occur" 
during  standby  or  checkout,  if  they  are  discovered  only  during  check¬ 
out,  they  must  be  correctly  assigned  as  being  due  to  standby  or  check¬ 
out  causes;  otherwise,  the  maintenance  plan  may  be  ineffective  or 
even  worthless. 

In  addition  to  the  problem  of  measuring  failure  rates,  other 
equally  important  problems  arise  from  factors  which  are  known  to 
have  potentially  strong  effects  on  availability,  and  which  must  there¬ 
fore  be  carefully  considered  in  devising  maintenance  policy,  but  which 
are,  unfortunately,  difficult  to  measure.  Errors  occur  during  check¬ 
out  in  deciding  whether  a  good  system  is  really  good  or  a  bad  system 
really  bad,  and  further  errors  are  committed  in  repair.  This  means 
a  system  can  enter  standby  in  a  failed  condition,  and  therefore  be  un¬ 
available  for  the  entire  standby  period. 

The  problem,  then,  becomes  one  of  sorting  out  these  contributory 
factors,  assessing  their  possible  influence  through  mathematical  analy¬ 
sis,  and  measuring  their  values  in  field  operations,  so  that  mainte¬ 
nance  policy  can  be  guided  accordingly.  As  it  does  not  appear  likely 
that  detailed  data  from  engineering  analysis  will  be  available  in  the 
near  future,  attention  was  directed  to  statistical  methods  of  parame¬ 
ter  estimation  (Reference  3).  In  order  to  discuss  the  measurement 
methods  in  detail,  the  model  parameters  (previously  defined  in 
References 1, 2 and 3) are reviewed below.

B. Parameter Definitions


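Figure 1 is a schematic diagram of the sequence of events
occurring in a periodic checkout policy.

[Figure 1. Time Sequence for a Periodically Checked System. The time
line shows, in order, REPAIR (duration $T_r$, probability $p$),
STANDBY (duration $T_s$, probability $p_s$), and CHECKOUT, which the
test decision point $D$ divides into intervals $T_{c_1}$ (probability
$p_{c_1}$) and $T_{c_2}$ (probability $p_{c_2}$).]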

The  cycle  is  assumed  to  start  with  a  repair  period,  followed 
by  a  standby  period,  after  which  there  is  a  checkout  period  which  is 
divided  into  two  parts,  corresponding  to  the  time  intervals  before 
and  after  the  point  of  "test  decision,"  D,  at  which  time  the  test  unit 
is  declared  good  or  bad.  The  symbols  in  the  figure  are  defined  as 
follows: 

$T_r$ = Duration of repair period

$T_s$ = Duration of standby period

$T_{c_1}$ = Duration of checkout interval prior to test decision

$T_{c_2}$ = Duration of checkout interval after test decision

$D$ = Point (in time) of test decision

$E$ = Probability that a unit which is failed at $D$ will be declared
bad at $D$

$\alpha$ = Probability that a unit which is good at $D$ will be
declared bad at $D$

$p$ = Probability that a unit is good at completion of repair

$p_s$ = Probability that a unit which is good at entrance to standby
is still good at completion of standby

$p_{c_1}$ = Probability that a unit which is good at entrance to
checkout is still good at $D$

$p_{c_2}$ = Probability that a unit which is good at $D$ is still good
at completion of checkout

The interaction of these parameters can be seen by checking
the  different  possible  ways  in  which  units  can  pass  through  the 
activities  of  Figure  1.  For  example,  a  unit  can  be  repaired  satis¬ 
factorily,  fail  during  standby,  but  pass  the  checkout.  If  the  actual 
condition  of  the  unit  during  the  time  line  sequence  is  described  in 
terms  of  its  condition  at  the  points  immediately  after  repair,  immedi¬ 
ately  after  standby,  the  point  of  test  decision,  and  the  end  of  checkout, 
there  are  five  possible  "histories"  of  the  unit  at  these  four  points, 
where  G  means  "good"  and  B  means  "bad":  GGGG,  GGGB,  GGBB, 
GBBB,  BBBB.  No  G  can  be  preceded  by  a  B,  as  there  is  no  repair 
after  the  initial  repair.  It  is  assumed  that  the  repair  is  of  the  replace¬ 
ment  type,  so  that  the  condition  of  the  (old)  unit  upon  entering  repair 
does  not  affect  the  probability  that  the  (new)  unit  will  be  good  upon 
exiting  from  repair. 
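
The five histories can be tabulated directly from the parameter definitions. The following is a minimal sketch, not part of the original programs: the function name and the illustrative value p = 0.9 are assumptions, while the other values are the nominal state parameters quoted in the Summary.

```python
def history_probabilities(p, p_s, p_c1, p_c2):
    """Probability of each condition history at the four checkpoints:
    after repair, after standby, at test decision D, at end of checkout.
    G = good, B = bad; a unit that has gone bad stays bad."""
    return {
        "GGGG": p * p_s * p_c1 * p_c2,        # survives everything
        "GGGB": p * p_s * p_c1 * (1 - p_c2),  # fails after test decision
        "GGBB": p * p_s * (1 - p_c1),         # fails in checkout before D
        "GBBB": p * (1 - p_s),                # fails during standby
        "BBBB": 1 - p,                        # bad on leaving repair
    }


# Illustrative (assumed) parameter values.
probs = history_probabilities(p=0.9, p_s=0.8, p_c1=0.8, p_c2=1.0)
for history, prob in probs.items():
    print(f"{history}: {prob:.4f}")
```

Because the five histories are mutually exclusive and exhaustive, the probabilities sum to one for any choice of parameters.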

With  this  many  parameters,  if  the  analysis  is  based  only  on 
the  test  decision  results,  the  individual  parameters  cannot  be  esti¬ 
mated  explicitly  unless  some  variation  occurs  in  the  basic  time  line 
sequence  of  Figure  1.  As  noted  in  the  Introduction,  two  specific 
variations  were  considered  in  Reference  3,  and  it  was  recommended 
in  that  document  that  the  question  of  confidence  limits  for  the  parame¬ 
ters  be  investigated,  using  Monte  Carlo  procedures.  These  programs 
and  the  numerical  results  are  described  in  the  following  two  parts. 

A  third  method  of  parameter  estimation  is  described  in  Reference  6. 
This  method  estimates  parameters  on  the  basis  of  the  probability 
distribution  of  the  number  of  cycles  to  first  repair. 

C.  Variable  Standby  Time  Method 

This method of parameter estimation is applicable when a series
of units experience different standby times between checkouts. Given
a minimum of three different standby times among a group of systems
for which data are available, Reference 3 provides equations for
estimating $E$, $\lambda$, and $p\,p_{c_1}(\alpha - E)$. The data
required to make these estimates are, for each system, the standby
time and whether or not the system passed the checkout.

If there are exactly three different standby times, whose lengths
are in the ratio one-two-three, the parameter estimation equations
can be solved analytically. The solution of the equations is provided
in Reference 3. Parameter estimation in this case is therefore
straightforward, and the accuracy of estimation will depend only on
the "noise" in the data due to its random nature. The average standby
time should be of the order of $1/\lambda$; otherwise large
inaccuracies are possible unless the sample is extremely large.
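
The one-two-three case can be illustrated with a short sketch. It assumes the exponential standby model stated in the Summary, so that the probability of failing the checkout after a standby of length T is E + K exp(-lambda T), with K = p p_c1 (alpha - E). The algebra below is derived here from that assumption; the solution given in Reference 3 may differ in form, and the function name is hypothetical.

```python
import math


def solve_three_standby_times(f1, f2, f3, T):
    """Solve for E, lambda and K = p * p_c1 * (alpha - E) from the
    fractions f1, f2, f3 of units failing checkout after standby
    times T, 2T and 3T, assuming f_k = E + K * r**k, r = exp(-lambda*T).
    """
    if f2 == f1:
        raise ValueError("no variation between the first two groups")
    # Successive differences are K*r*(r-1) and K*r**2*(r-1); their ratio is r.
    r = (f3 - f2) / (f2 - f1)
    if not 0 < r < 1:
        raise ValueError("data inconsistent with the model (need 0 < r < 1)")
    K = (f2 - f1) / (r * (r - 1))
    E = f1 - K * r
    lam = -math.log(r) / T
    return {"E": E, "lambda": lam, "K": K}
```

With noisy data the ratio r can fall outside (0, 1), in which case no solution exists under this model; this is one way the "noise" sensitivity mentioned above shows up in practice.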

For more than three standby times, the equations whose solution
is required for the maximum likelihood estimates cannot be solved
analytically. A series of computer search routines was therefore
developed by the Applied Mathematics Department, Programming
and Applied Mathematics Laboratory (STL), as reported in Reference 7.
The solution of these equations proved to be exceedingly difficult and
time consuming, so that time did not allow more than a few preliminary
computer runs for specific examples. These examples all used
expected values as the input data, and in most cases the correct
parameter values were recovered. These examples are discussed in
Reference 7. A later example, in which "noise" was introduced into the
data through a randomization process based on expected values,
provided inconclusive results.

A  program  is  available  which  is  workable  under  most  circum¬ 
stances,  should  the  opportunity  arise  to  process  data  pertaining  to 
variable  standby  times.  For  a  further  discussion  of  this  program  and 
its  limitations,  the  reader  is  referred  to  Reference  7. 

D.  Multiple  Checkout  Method 

The derivation and use of this method of parameter estimation
require that three consecutive checkouts be performed following
standby, as shown in Figure 2.


[Figure 2. Time Sequence for a System Undergoing Multiple Checkouts.
The time line shows REPAIR and STANDBY followed by three consecutive
CHECKOUT periods, with test decision points $D_1$, $D_2$, and $D_3$.]

As before, the cycle is assumed to begin with a repair, followed
by a standby period. This is followed by three consecutive checkouts,
with no repair being performed, regardless of the test decisions at
$D_1$, $D_2$, and $D_3$. Note that the actual condition of a unit can
change (from good to bad) in passing from one checkout to the next,
since failures can occur during checkout.

Three test decisions are thus obtained for each unit undergoing
the sequence of Figure 2. These individual records can then be
grouped into eight categories, corresponding to the eight possible
outcomes at $D_1$, $D_2$, and $D_3$, where P denotes passing the test
and F denotes failing: (PPP), (PPF), (PFP), (PFF), (FPP), (FPF), (FFP),
and (FFF).
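
The grouping of records into the eight categories can be illustrated by a small simulation. This is a sketch under the model described above, written in Python; it is not the original 7090 program, and the function names, seed handling, and parameter values are assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_unit(rng, p, p_s, p_c1, p_c2, alpha, E):
    """Simulate one unit through repair, standby and three checkouts.
    Returns the test decisions as a string such as 'PFP'."""
    good = rng.random() < p                  # condition after repair
    good = good and rng.random() < p_s       # survives standby?
    outcome = ""
    for _ in range(3):
        good = good and rng.random() < p_c1  # survives to test decision?
        # A good unit is called bad with probability alpha (Type I error);
        # a bad unit is called bad with probability E.
        fail_prob = alpha if good else E
        outcome += "F" if rng.random() < fail_prob else "P"
        good = good and rng.random() < p_c2  # survives rest of checkout?
    return outcome


def simulate_sample(M, seed=0, **params):
    """Tally the eight outcome categories for a sample of M units."""
    rng = random.Random(seed)
    return Counter(simulate_unit(rng, **params) for _ in range(M))


counts = simulate_sample(500, seed=1, p=1.0, p_s=0.8, p_c1=0.8,
                         p_c2=1.0, alpha=0.1, E=0.9)
print(sorted(counts.items()))
```

No repair is simulated between checkouts, so a unit that has gone bad remains bad for the remaining decisions, as the method requires.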

The probabilities of these sequences can be written in terms
of the variables shown in Figure 2. For example, by considering
the mutually exclusive ways of producing the sequence (PFP), the
probability is obtained as

$$
\begin{aligned}
\Pr(\mathrm{PFP}) ={}& p\,p_s\,p_{c_1}(p_{c_1}p_{c_2})^2\,\alpha(1-\alpha)^2 \\
&+ p\,p_s\,p_{c_1}(p_{c_1}p_{c_2})(1-p_{c_1}p_{c_2})\,\alpha(1-\alpha)(1-E) \\
&+ p\,p_s\,p_{c_1}(1-p_{c_1}p_{c_2})\,(1-\alpha)E(1-E) \\
&+ (1-p\,p_s\,p_{c_1})\,(1-E)^2E
\end{aligned}
$$

For convenience, as in Reference 3, the variables
$x = p\,p_s\,p_{c_1}(\alpha - E)$ and $t = p_{c_1}p_{c_2}$ will be
introduced. Equations (1) to (8) below give the probabilities of the
sequences in terms of $E$, $\alpha$, $x$ and $t$.


$$
\begin{aligned}
\Pr(\mathrm{PPP}) &= -(1-\alpha)\,xt\,[t(1-\alpha) + (1-E)] + (1-E)^2(1-E-x) && (1)\\
\Pr(\mathrm{PPF}) &= (1-\alpha)\,xt\,[t(1-\alpha) - E] + E(1-E)(1-E-x) && (2)\\
\Pr(\mathrm{PFP}) &= -(1-\alpha)\,xt\,[\alpha t - (1-E)] + E(1-E)(1-E-x) && (3)\\
\Pr(\mathrm{PFF}) &= (1-\alpha)\,xt\,(\alpha t + E) + E^2(1-E-x) && (4)\\
\Pr(\mathrm{FPP}) &= -\alpha\,xt\,[t(1-\alpha) + (1-E)] + (1-E)^2(E+x) && (5)\\
\Pr(\mathrm{FPF}) &= \alpha\,xt\,[(1-\alpha)t - E] + E(1-E)(E+x) && (6)\\
\Pr(\mathrm{FFP}) &= -\alpha\,xt\,[\alpha t - (1-E)] + E(1-E)(E+x) && (7)\\
\Pr(\mathrm{FFF}) &= \alpha\,xt\,(\alpha t + E) + E^2(E+x) && (8)
\end{aligned}
$$


In these equations, it is of interest to note the duality property:
if all P's are changed to F's in any given expression, and vice
versa, the correct new expression is obtained by merely substituting
$1-\alpha$ for $\alpha$ and $1-E$ for $E$ (and, therefore, $-x$ for
$x$, since $x$ involves $\alpha$ and $E$ through the term
$\alpha - E$). For example, by comparing the equations for
$\Pr(\mathrm{PPP})$ and $\Pr(\mathrm{FFF})$, it is seen that the latter
is obtained from the former by substituting $1-\alpha$ for $\alpha$,
$1-E$ for $E$, and $-x$ for $x$.
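
The sequence probabilities and the duality property can be checked numerically. The sketch below computes the eight probabilities in two ways: by enumerating the condition histories at the three decision points, and from closed forms in E, alpha, x and t written out from the same model. The function names and parameter values are assumptions for illustration, and alpha denotes the Type I error probability.

```python
from itertools import product


def outcome_probs_enumerated(g1, t, alpha, E):
    """g1 = p*p_s*p_c1 is the probability of being good at D1;
    t = p_c1*p_c2 is the probability of staying good to the next decision."""
    histories = {            # condition at (D1, D2, D3); bad stays bad
        "GGG": g1 * t * t,
        "GGB": g1 * t * (1 - t),
        "GBB": g1 * (1 - t),
        "BBB": 1 - g1,
    }
    probs = {}
    for seq in product("PF", repeat=3):
        total = 0.0
        for hist, p_hist in histories.items():
            term = p_hist
            for state, result in zip(hist, seq):
                fail = alpha if state == "G" else E
                term *= fail if result == "F" else 1 - fail
            total += term
        probs["".join(seq)] = total
    return probs


def outcome_probs_closed_form(x, t, alpha, E):
    """Closed forms in terms of x = g1*(alpha - E) and t."""
    a = alpha
    return {
        "PPP": -(1 - a) * x * t * (t * (1 - a) + (1 - E)) + (1 - E) ** 2 * (1 - E - x),
        "PPF": (1 - a) * x * t * (t * (1 - a) - E) + E * (1 - E) * (1 - E - x),
        "PFP": -(1 - a) * x * t * (a * t - (1 - E)) + E * (1 - E) * (1 - E - x),
        "PFF": (1 - a) * x * t * (a * t + E) + E ** 2 * (1 - E - x),
        "FPP": -a * x * t * (t * (1 - a) + (1 - E)) + (1 - E) ** 2 * (E + x),
        "FPF": a * x * t * ((1 - a) * t - E) + E * (1 - E) * (E + x),
        "FFP": -a * x * t * (a * t - (1 - E)) + E * (1 - E) * (E + x),
        "FFF": a * x * t * (a * t + E) + E ** 2 * (E + x),
    }


g1, t, alpha, E = 0.64, 0.8, 0.1, 0.9     # illustrative (assumed) values
enum = outcome_probs_enumerated(g1, t, alpha, E)
closed = outcome_probs_closed_form(g1 * (alpha - E), t, alpha, E)
dual = outcome_probs_enumerated(g1, t, 1 - alpha, 1 - E)
print(max(abs(enum[k] - closed[k]) for k in enum))   # agreement of the two forms
print(abs(dual["FFF"] - enum["PPP"]))                # duality check
```

Both printed differences should be at the level of floating-point rounding if the closed forms and the enumeration describe the same model.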

1.  Parameter  Estimation  Equations 

The above equations can be solved to obtain estimates for $E$,
$\alpha$, $x$ and $t$. The estimates used in the computer analysis to
be described are as follows:


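(The estimating equations are only partly legible in this copy. The
fragments that can be recovered are shown below, with the illegible
portions marked by dots.)

$$\hat{E} = \frac{(PFF) - (FPF)}{\cdots} \qquad (9)$$

$$\cdots = \frac{(FPF) - (FFP) + (PPF) - (PFP)}{\cdots}$$

$$\hat{x} = \frac{(FPP) + (FPF) + (FFP) + (FFF)}{M} - \cdots$$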

In these equations, the symbols (PFF), (FPF), etc., represent
the numbers of units experiencing those particular sequences, and
M denotes the total number of units in the sample
(M = (PFF) + (FPF) + ...).
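
One set of estimators that follows algebraically from Equations (1) to (8) is sketched below. Taking differences of the sequence probabilities gives Pr(PFF) - Pr(FPF) = E x (t - 1), Pr(PFP) - Pr(FPP) = (1 - E) x (t - 1), Pr(FPF) - Pr(FFP) = alpha x t (t - 1), and Pr(PPF) - Pr(PFP) = (1 - alpha) x t (t - 1), while the four sequences beginning with F sum to E + x. Replacing probabilities by observed counts yields the estimates coded here. This is a reconstruction under those identities, with a hypothetical function name; it is not necessarily the exact form used in the computer analysis or in Reference 3.

```python
def estimate_parameters(n):
    """Estimate E, alpha, t and x from outcome counts.
    n maps each of 'PPP', ..., 'FFF' to a count (or a relative frequency)."""
    M = sum(n.values())
    d_E = n["PFF"] - n["FPF"]     # ~ M * E * x * (t - 1)
    d_1E = n["PFP"] - n["FPP"]    # ~ M * (1 - E) * x * (t - 1)
    d_a = n["FPF"] - n["FFP"]     # ~ M * alpha * x * t * (t - 1)
    d_1a = n["PPF"] - n["PFP"]    # ~ M * (1 - alpha) * x * t * (t - 1)
    if d_E + d_1E == 0 or d_a + d_1a == 0:
        # Happens, for example, when t = 1 (no failures during checkout)
        # or x = 0: the differences vanish and nothing can be estimated.
        raise ValueError("estimating equations degenerate")
    E_hat = d_E / (d_E + d_1E)
    alpha_hat = d_a / (d_a + d_1a)
    t_hat = (d_a + d_1a) / (d_E + d_1E)
    x_hat = (n["FPP"] + n["FPF"] + n["FFP"] + n["FFF"]) / M - E_hat
    return {"E": E_hat, "alpha": alpha_hat, "t": t_hat, "x": x_hat}
```

The degenerate branch corresponds to the situation noted in the Summary: if no change of state can occur during the three checkouts, the differences above are all zero and no estimates of alpha and E can be formed without additional information.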

As  pointed  out  in  Reference  3,  of  the  parameters  E,  a,  p,  p 

s 

pc^,  and  pc^»  onlY  E  and  a  are  obtained  directly  from  the  Multiple 
Checkout  Method,  unless  some  of  the  standby  times  are  0,  or  unless 
some  knowledge  is  assumed  about  oneor  more  of  the  parameters.  Esti¬ 
mates  of  certain  combinations  of  the  other  parameters  are  also  obtained. 

Attention was focused on the parameters E and a in setting
up the computer runs, and the following discussion reflects this
fact. However, it should be emphasized that the other parameters
can be estimated under certain conditions, and the ability to measure
these parameters can be investigated through a modification
to the existing computer program. First, assume that immediately
after repair, all units are subjected to three consecutive
checkouts. All parameters except p_s can then be estimated if
some relationship is known between p_c1 and p_c2 (or if the value
of one of these parameters is known). For example, it might

2 

be  known  that  pr  =  pr  . 

Ci  rc2 

Or, if the test decision occurs at or near the end of checkout,
or if the environmental stress is minimal in checkout after the
point of test decision, it might be assumed that p_c2 ≈ 1. As an
estimate for t = p_c1 p_c2 is available, we can then estimate both
p_c1 and p_c2. Knowing a, E, and p_c1, the estimate of x = p p_c1 (a - E)
can be used to estimate p. This completes the list of unknown
parameters, except for p_s.

Second, if a group of similar units pass from repair into
standby for some fixed period, and then are given a single checkout,
the data from these units can be combined with that from the
units above, which had three checkouts without entering standby,
to estimate p_s. The estimate of x = p p_s p_c1 (a - E) for the group
undergoing standby, when divided by the estimate of x = p p_c1 (a - E)
for the group without standby, gives an estimate of p_s.

When  estimates  are  available  for  all  of  these  parameters, 
an  estimate  can  also  be  made  for  availability.  Thus,  the  accuracy 
of  estimating  availability  could  be  analyzed.  However,  as  it  was 
not  an  objective  of  the  present  computer  program  to  investigate 
these possibilities, only a few hand calculations were applied to
the output from the present program. (See Paragraph 3(d) below.)

2.  Computer  Program 

A computer program was devised to simulate test results in
order to check the characteristics of the estimating Equations
(9) to (12). In the program, a specified number of units is processed
through the activities shown in Figure 2, and for each unit a
random number is generated at each critical point in the time
sequence to determine what actually happens. For example,
for the first unit a random number is drawn and compared with
p (the probability of successful repair). If the number is smaller
than p, the repair is successful and another random number is
drawn and compared with p_s (the probability of surviving standby),
and so on. If the original random number had not been less than
p, the repair would have been unsuccessful. Since the unit was
then in a failed condition, it would remain in a failed condition
for the rest of the sequence of events, as no repair is performed.
Therefore, no further random numbers would be drawn until the
first test decision, where a number would be drawn and compared
with E. Numbers would similarly be drawn at the next two points
of test decision. Thus, the minimum number of random numbers
generated in the sequence is four (one at repair and one at each
test decision), whereas, if the unit survives the entire sequence
(except, perhaps, for the last check period), ten random numbers
will be required.
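
The drawing procedure just described can be sketched in Python as follows. This is an illustrative reconstruction, not the original 7090 program. It assumes the order repair, standby, then three checkouts with no standby between them, each checkout consisting of a first portion (survival probability p_c1), a test decision, and a second portion (survival probability p_c2), with the second portion of the last checkout not simulated. The function names and the draw counter are hypothetical.

```python
import random
from collections import Counter


def simulate_unit(rng, a, E, p, p_s, p_c1, p_c2):
    """Simulate one unit; return (history, number_of_random_draws)."""
    draws = 0

    def event(prob):
        # Draw one random number and report whether it is smaller than prob.
        nonlocal draws
        draws += 1
        return rng.random() < prob

    good = event(p)              # repair succeeds with probability p
    if good:
        good = event(p_s)        # unit survives standby with probability p_s
    history = ""
    for checkout in range(3):
        if good:
            good = event(p_c1)   # first portion of checkout
        # Test decision: a good unit is called bad with probability a,
        # a failed unit is called bad with probability E.
        called_bad = event(a if good else E)
        history += "F" if called_bad else "P"
        if good and checkout < 2:
            good = event(p_c2)   # second portion (not needed after the last decision)
    return history, draws


def simulate_run(rng, M, **params):
    """Count the test histories (PPP, PPF, ...) observed for M units."""
    return Counter(simulate_unit(rng, **params)[0] for _ in range(M))
```

A unit whose repair fails uses four draws (repair plus three test decisions), and a unit that stays good throughout uses ten, in agreement with the counts stated above. As in the report, only the pass/fail histories are returned, not the true state of each unit.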

A  series  of  individual  histories  is  generated  in  this  manner. 
While  the  computer  "knows"  which  particular  units  are  good  and 
which bad, this information is not printed out. As in an actual
set of test data, the data printed out by the program are the numbers
of units passing and failing at each test decision, and the numbers
of  units  with  each  of  the  eight  possible  test  histories,  (PPP), 
(PPF),  etc. 

The data output from this program, written for the 7090
computer, is extremely rapid. About 90,000 items can be run
through the test sequence in one minute of machine time. This
includes the calculation of means and variances of the estimators
and the printout of results. Because of the computation speed,
it was possible to investigate a large number of cases with different
parameter values, and a satisfactory range of sample
sizes.

The program input requires specification of the parameters
a, E, p, p_s, p_c1, and p_c2; sample size, M; and the number of
runs, N, to be made with all of the above numbers held fixed.
The printed output is in two parts, a summary printout for N
runs, and an optional detailed printout for each run. The summary
printout lists as a heading the true values of the input parameters
and sample size, and the true (computed) values of t = p_c1 p_c2 and
x = p p_s p_c1 (a - E). The following statistics are then printed out
for each of the estimators â, Ê, x̂, and t̂: average (for the M
units and N runs), variance from average, variance from true,
maximum for each parameter, minimum for each parameter,
and the fraction of values within 5, 10, 20 and 50 percent of the
average. In addition, the average, variance, maximum, and
minimum of the numbers of units failing each of the three checkouts
is printed. The detailed printout lists for each run the
number of units in each possible sequence PPP, PPF, etc., and
the values of each of the estimators.
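
The summary statistics listed above can be illustrated with a short Python sketch. It is not the original program: the function name is hypothetical, and the variances are assumed to be computed with divisor N (the report does not say whether N or N - 1 was used).

```python
def summarize(estimates, true_value, percents=(5, 10, 20, 50)):
    """Summary statistics for one estimator over N runs."""
    n = len(estimates)
    avg = sum(estimates) / n
    # Variance about the sample average and about the true parameter value.
    var_from_avg = sum((v - avg) ** 2 for v in estimates) / n
    var_from_true = sum((v - true_value) ** 2 for v in estimates) / n
    # Fraction of runs whose estimate lies within pct percent of the average.
    within = {
        pct: sum(abs(v - avg) <= pct / 100 * abs(avg) for v in estimates) / n
        for pct in percents
    }
    return {
        "average": avg,
        "variance_from_average": var_from_avg,
        "variance_from_true": var_from_true,
        "maximum": max(estimates),
        "minimum": min(estimates),
        "fraction_within": within,
    }
```

The two variances differ by the square of the difference between the average and the true value, so comparing them gives a quick indication of estimator bias.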


Note  that  the  program  does  not  print  out  the  numbers  of 
units  actually  good  and  bad,  but  only  the  numbers  of  units 
passing  and  failing  each  checkout. 


3.  Discussion  of  Results 

Confidence limits cannot be prescribed for any of the parameters
unless there is the possibility of real failures during checkout.
This is not a serious limitation, as generally this possibility will
exist. The Multiple Checkout Method is most useful when there
is a significant change in the state probabilities from checkout to
checkout. This is because of the nature of the estimating equations,
which degenerate when there are no failures during checkout.
If p_c1 = p_c2 = 1 (t = 1), Equations (1) to (8) reduce to four equations,
since the probability of a particular sequence of P's and F's does
not depend on their order; therefore, P_r(PPF) = P_r(PFP) = P_r(FPP)
and P_r(PFF) = P_r(FPF) = P_r(FFP). With these probabilities equal,
the estimating Equations (9) to (11) have an expectancy of 0 in both
the numerator and the denominator. Also, Equation (12) depends
on (9). It will therefore be assumed that t ≠ 1.

The numerical results are tabulated in Appendix A. Not all
of the computer summary output is shown because of space requirements.
The tables show the mean and variance from average for
Ê and â for each case. The true values of the parameters used
in  each  case  are  shown,  and  the  values  for  M  (number  of  units  in 
the  sample)  and  N  (number  of  runs).  The  number  of  runs  required 
to  provide  an  approximately  correct  value  for  the  variances  was 
determined  by  a  preliminary  series  of  runs  in  which  M  was  held 
fixed  and  N  was  varied  to  indicate  at  what  point  the  variance  does 
not differ widely from one series of N runs to the next. The larger
the true variance is, the larger N should be to estimate it accurately;
but as it is of no value to have accurate estimates of large
variances, a compromise was made in which N is large enough
to approximate the true variance, providing it is small enough
to be of some use. Therefore, in the tables, variances larger
than about 0.1 may be inaccurate.

a. Combined Variation of p_s, a, and E (Tables A-1 to A-9)

The first series of tables, Tables A-1 to A-9, report
the results of the most extensive series of runs, which were
for combinations of the following parameter values:

E = 1.0, 0.9, 0.7

a = 0, 0.1, 0.3

p_s = 1.0, 0.8, 0.5

p_c1 = 0.8, p_c2 = 1.0, p = 1.0

For the combination a = 0, E = 1, the estimates are
exact: â becomes identically 0 and Ê identically 1. Therefore,
this combination was not run. Since p p_s always appears
as a product in the equations, the tables are the same as if
p_s were held fixed and p varied.

The  tables  show  the  marked  effect  of  the  true  value  of  E  on 
the  accuracy  of  estimating  both  E  and  a,  and,  similarly,  for  the 
true value of a. For p_s = 1, if E = 1 and a = 0.1, a reasonably
accurate estimate of both E and a is obtained with M = 300;
whereas if E = 0.7, the sample size must be increased to
about 1000 for comparable accuracy. The same holds for a change
in a from 0 to 0.3, with E = 0.9. This is shown in Table 1
below.


Table 1. Effect of True Values of E and a on Estimation Accuracy

  E      a      M       Variance of Ê    Variance of â

  1.0    0.1    100     0.1562           0.0242
                300     0.0149           0.0053
                500     0.0081           0.0032
                1000    0.0035           0.0015

  0.7    0.1    100     0.5742           0.6026
                300     0.0625           0.4254
                500     0.0269           0.1992
                1000    0.0118           0.0163

  0.9    0      100     0.0096           0.1023
                300     0.0025           0.0097
                500     0.0018           0.0063
                1000    0.0009           0.0031

  0.9    0.3    100     1.2113           0.9668
                300     0.7569           0.2929
                500     0.0444           0.0575
                1000    0.0148           0.0152

The table shows that if only one of the two types of error
is present (even though this fact is not known), the estimate
for the other type of error becomes much more accurate. If
it is known that only one type of error is present, Equations
(1) to (8) become different, and give rise to different estimators
for E or a (whichever is present). If it is known that
a = 0, an estimator for E is

&  _  (FFF) 

E  "  TFFFTT7FFF7  <13) 

This estimator has a smaller variance than that given by
Equation (9); for one thing, it is seen that Equation (13) must
lie between 0 and 1, whereas Equation (9) frequently gets
bigger than 1.

If it is known that E = 1, an estimator for a is


(14) 


This  is  a  better  estimator  than  Equation  (10),  and  lies  between 
0  and  1,  whereas  Equation  (10)  frequently  becomes  negative. 
The  properties  of  these  estimators  were  not  investigated  on 
the  computer. 

As p_s decreases from 1, other parameters being held
constant, the tables show how the accuracy of estimating
E and a diminishes. Table 2 was constructed to indicate
this fact as well as to show the combined effects of varying
E, a, and p_s. In Table 2, M is fixed at 500, p = 1, p_c1 = 0.8,
and p_c2 = 1. The table demonstrates a large effect on accuracy
as E, p_s, and 1 - a decrease from their maximum values.
When p_s = 1, σ²_â = 0.0104 when a = 0.1 and E = 0.9, and
when p_s = 0.5 for the same values of a and E, σ²_â = 0.4287.
Or, when p_s = 0.8, if a = 0, as E decreases to 0.7, σ²_Ê
increases to only 0.0134, and if E = 1, as a increases to
0.3, σ²_Ê increases to 0.0511; but if both effects occur simultaneously,
σ²_Ê increases to 1.3742. As mentioned earlier,
values of large variances are not accurate. This is indicated
here, since this value (1.3742) is less than the corresponding
value for p_s = 1, when it can be expected to be larger. Another
anomaly is seen for the value of σ²_â when p_s = 0.5, E = 0.9
and a = 0.3, which value is less than the corresponding value
for p_s = 0.8. An even more obvious case is the decrease
in σ²_â for increasing a when p_s = 0.5 and E = 0.7.

Table 2 (M = 500, p = 1, p_c1 = 0.8, p_c2 = 1)

p_s = 0.8, σ²_Ê

  E \ a    0            0.1       0.3
  1.0      (not run)    0.0113    0.0511
  0.9      0.0027       0.0150    0.1718
  0.7      0.0134       0.1027    1.3742

p_s = 0.5, σ²_Ê

  E \ a    0            0.1       0.3
  1.0      (not run)    0.0181    0.3281
  0.9      0.0073       0.0555    1.6381
  0.7      0.6273       0.8417    2.9079

p_s = 1.0, σ²_â

  E \ a    0            0.1       0.3
  1.0      (not run)    0.0032    0.0192
  0.9      0.0063       0.0104    0.0575
  0.7      0.0195       0.1992    0.7782

p_s = 0.8, σ²_â

  E \ a    0            0.1       0.3
  1.0      (not run)    0.0039    0.0321
  0.9      0.0203       0.0268    0.7532
  0.7      0.2489       0.4182    1.4979

p_s = 0.5, σ²_â

  E \ a    0            0.1       0.3
  1.0      (not run)    0.0072    0.0579
  0.9      0.1554       0.4287    0.6936
  0.7      3.6239       2.5981    2.0662


b. Effects of Other Parameters (Tables A-10 to A-13)

As regards the effects of the other parameters, p, p_c1,
and p_c2, it was mentioned earlier that p has the same effect
as p_s, and that if p_c1 = p_c2 = 1, no estimates can be made.
The values chosen in the tables just discussed, Tables A-1
to A-9, were p_c1 = 0.8 and p_c2 = 1.0. These values were
not chosen on the basis of providing good estimates for E
and a, as other values are known which will give better
estimates. They were meant to represent "reasonable"
values for the common situation in which the test decision
occurs near the end of checkout, or, in general, where
there is little chance of failure after test decision. For a
fixed value of t = p_c1 p_c2, the lower p_c2 is, the more accurate
the estimate of either E or a. This unexpected result can
be interpreted to mean (assuming equal environmental
stresses before and after test decision) that the earlier in
time that the test decision is made, the better. This is not
to say, however, that "snap judgment" is preferred, as it
is assumed that the true E and a are not a function of how
quickly the decision is made.

Tables A-10 and A-11 show the results for p_c1 = 0.5,
p_c2 = 1.0, and p_c1 = 1.0, p_c2 = 0.5, respectively, with
the parameters E, p, and p_s at their maximum values, and
for various values of a. It is seen that good estimates of
a are obtained with p_c1 = 1.0, p_c2 = 0.5 (Table A-11), even
for a sample size of 100. Worse values are obtained for a
and E when the values for p_c1 and p_c2 are reversed (Table
A-10). Table A-12 shows that increasingly better estimates
of E are obtained as p_c decreases, the best case being when
p_c = 0. Estimates for this case are listed in Table A-13.
A sample of 50 or even 25 is adequate, as shown in the table.
(Although M = 50 is the lowest value shown in the table, a
comparison of the variance with M shows a linear relationship
which can be used to extrapolate or interpolate to other
values of M.) In Tables A-12 and A-13, a is identically 0
and therefore is not shown.

c. Comparison of Favorable Cases for a and E
(Tables A-14 and A-15)

Tables A-14 and A-15 show the effect of varying the
other parameters for the "favorable cases" (i.e., parameter
combinations allowing good estimates) for E and a. In these
tables, as in other tables in the appendix, parameter values
are not repeated from case to case if they do not change.
For example, in Table A-14, the second line specifies
E = 0.9. This means all other parameters (including M
and N) remain the same. Study of these tables shows that
for fixed values of the other parameters, σ²_â increases
approximately linearly with a, and σ²_Ê increases approximately
linearly with 1 - E; and both decrease approximately
in inverse proportion to sample size. For example, for
E = p = 1, σ²_â ≈ 5a/M. This relationship could be used to
find the approximate sample size required to measure a
with a specified accuracy. If a simple criterion is used,
such as σ_â = a/2, then it is found by combining these two
equations that M ≈ 20/a. This shows that even though (as
found previously) the accuracy of measuring a increases
with decreasing a, the relative accuracy (in terms of fractional
error of the true value) decreases. That is, using
the relationship between M and a, we have

a = 0.05, M = 400
a = 0.1, M = 200
a = 0.2, M = 100

all giving the same relative accuracy of measuring a, namely,
a standard deviation of 1/2 the true value.
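
The sample-size rule above follows from two lines of arithmetic, sketched here in Python. The constant 5 is the empirical value quoted in the text for the favorable case and is treated as an assumption for any other parameter combination; the function names are hypothetical.

```python
import math


def variance_of_a_hat(a, M, k=5.0):
    """Approximate variance of the estimate of a: k * a / M (k = 5 from the text)."""
    return k * a / M


def required_sample_size(a, relative_sd=0.5, k=5.0):
    """Sample size M for which the standard deviation equals relative_sd * a.

    Setting k * a / M = (relative_sd * a) ** 2 and solving for M gives
    M = k / (relative_sd ** 2 * a); with relative_sd = 1/2 this is 20 / a.
    """
    return k / (relative_sd ** 2 * a)


if __name__ == "__main__":
    for a in (0.05, 0.1, 0.2):
        M = required_sample_size(a)
        sd = math.sqrt(variance_of_a_hat(a, M))
        print(f"a = {a}: M = {M:.0f}, standard deviation = {sd:.3f}")
```

A tighter criterion scales the sample size by the inverse square of the relative standard deviation: halving the allowed relative error quadruples M.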

A comparison of the tables for E and a shows that, other
things being equal, it is easier to measure E accurately than
a. This is for two reasons: (1) σ²_Ê is smaller than σ²_â
under "similarly favorable" conditions, and (2) E > a for
any useful system, so that the relative accuracy is greater
for E even if the σ's were the same.

d.  Estimate  of  Availability 

Two samples of 20 runs were made under the condition
that p_c2 = 1 and with two values of p_s: p_s = 1 and p_s = 0.7.
Under these conditions, all parameter values are either known
or estimated, so that an estimate can be made for system
availability (for an assumed checkout and repair time).
Table A-16 shows the twenty sets of estimates obtained for
E, a, p_c1, p, p_s and λ (for an assumed standby time of
1 unit, i.e., λ = -ln p_s). The estimates for all parameters
except p_s were obtained from the set of runs with p_s = 1,
as this is a more favorable case. The estimates of p_s were
obtained by matching pairs in order of occurrence in the two
sequences of runs, but any order could have been used.

The  estimates  for  availability  shown  in  the  table  were 
obtained  from  the  equation  (Reference  5,p.  7): 


T  +  T  +  T  ;  E  +  P(G)e  8  s  p  (a  -  E) 
s  c  r  j  c 


where

P(G) = \frac{E p}{1 - e^{-\lambda T_s} p_c [1 - a + p (a - E)]}



Since p_c2 = 1 and p_c = p_c1, and estimates of all other parameters
are provided in Table A-16, system availability can be
estimated for assumed values of T_s, T_r, and T_c, as shown
in the right-hand column of Table A-16. The sample size
for this table is M = 1000.


e.  Normal  Approximation  to  Distribution  of  Estimators 

To obtain confidence limits, the normal distribution can
be applied to the estimators if the variance is of the order
of 0.01 or less (i.e., standard deviation = 0.1 or less). An
example of applying this approximation is given in Table 3
below, for the case M = 300, E = 1.0, p = 1.0, p_s = 1.0,
p_c1 = 0.8, p_c2 = 1.0, a = 0.1.
u 

Table 3. Fraction of Values Within P Percent of True Value

                    Ê                         â
  P Percent    Actual    Normal Approx    Actual    Normal Approx

  5            0.326     0.326            0.048     0.040
  10           0.608     0.600            0.108     0.112
  20           0.914     0.905            0.224     0.221
  50           1.000     1.000            0.554     0.516
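
The normal approximation in Table 3 can be sketched in Python as below. The sketch assumes an unbiased, normally distributed estimator with a known variance; the function name is hypothetical. Since the report does not state which variance value was used for its "Normal Approx" columns, the results need not reproduce those columns exactly.

```python
import math


def fraction_within(percent, true_value, variance):
    """Probability that a normal, unbiased estimator lies within
    `percent` percent of the true value."""
    half_width = percent / 100 * abs(true_value)
    z = half_width / math.sqrt(variance)
    # For X ~ N(mu, sigma^2): P(|X - mu| <= z * sigma) = erf(z / sqrt(2))
    return math.erf(z / math.sqrt(2))


if __name__ == "__main__":
    # Variance of the estimate of E for M = 300, E = 1.0, a = 0.1, as listed in Table 1.
    for pct in (5, 10, 20, 50):
        print(pct, round(fraction_within(pct, 1.0, 0.0149), 3))
```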

f.  Estimator  Bias 

In  calculating  the  values  of  the  estimators  and  their 
variances  on  the  computer  runs,  no  truncation  was  performed 
on  estimates  greater  than  one  or  less  than  zero.  Since  the 
parameters  are  probabilities,  in  any  actual  case  where  the 
estimate  was  greater  than  one  or  negative,  it  would  be 
rounded  to  one  or  zero,  respectively.  This  could  reduce  the 
variance  about  the  true  parameter  value,  but  would  tend  to 
bias  the  estimates  so  that  the  expected  value  would  not  equal 
the true value. The reason for this is that the present untruncated
estimates are apparently unbiased. The variances
shown  in  the  tables  are  about  the  average,  and  these  would 
not  necessarily  decrease  under  truncation.  This  was  not 
investigated  on  the  computer.  Without  truncation,  it  was 
found  that  there  was  little  difference  between  the  variances 
about  the  average  and  about  the  true  values. 

III. PARAMETRIC ANALYSIS OF AVAILABILITY OF A
PERIODICALLY  CHECKED  SYSTEM 


A.  Introduction 

In this section the variables which affect the availability of a
periodically monitored system are discussed on the basis of a
computer analysis in which the parameters were allowed to vary
individually. To aid in parameter definition,
Figure 3 illustrates again the sequence of events assumed for a system
subject to periodic checkout. This figure is similar to Figure 1,
but additional parameters are indicated.


[Figure 3: time line, with TIME (t) as the axis, showing the STANDBY,
CHECKOUT, and REPAIR (IF REQUIRED) periods; the POINT OF TEST DECISION
(a, β) is marked within the checkout period.]


Figure  3.  Time  Sequence  for  a  Periodically  Checked  System 

The symbols appearing in this diagram are defined as follows:

P_AR = probability of alert readiness, or probability the system
is in standby and nonfailed at a random point in time

T_s = duration of standby period

T_c = duration of checkout period

T_r = duration of repair period

a = probability of calling a system with no detectable failures
bad during checkout

β = probability of calling a system with a detectable failure
good during checkout*

λ_d = rate of occurrence of detectable failures during standby


* The parameter β used in this section equals 1 - E of the previous
section.

λ_u = rate of occurrence of undetectable failures during standby

1 - μ1 - μ2 = probability a repaired system is unfailed

μ1 = probability a repaired system is failed detectably

μ2 = probability a repaired system is failed only undetectably

p_d,tc1 = probability of no detectable failures occurring during first
portion of checkout (prior to test decision)

p_d,tc2 = probability of no detectable failures occurring during
second portion of checkout (after test decision)

p_d,tc = probability of no detectable failures occurring during
checkout = p_d,tc1 p_d,tc2

p_u,tc = probability of no undetectable failures occurring during
checkout

p_d,ts = probability of no detectable failures occurring during
standby = exp(-λ_d T_s)

p_u,ts = probability of no undetectable failures occurring during
standby = exp(-λ_u T_s)



As in Part II of this report, the list includes parameters representing
the possibility of errors during checkout, failures induced by
checkout, imperfect repair, and failures during standby. In addition,
the parameters λ_u, 1 - μ1 - μ2, and p_u are introduced to allow for the
possibility of failures during standby, repair and checkout which are
inherently undetectable by the checkout procedures employed to monitor
the system. This characteristic is called "partial test coverage,"
and is different from failures undetected due to errors in equipment
or personnel, as accounted for by the parameter β. In the model, a
system failure of the type being represented by λ_u is never detected
by the checkout procedures employed, and will only be discovered
later (if at all) at a rear echelon when more thorough tests are performed.
Though not detected, failures of this type may be corrected
because of the occurrence of other (detectable) failures, or a false
alarm, both of which lead to repair (or replacement). For those
modes of failure which are presumably covered by the checkout,
some will occasionally be missed due to error, and this is specified
to occur with a probability β.

As seen in the list of definitions, the introduction of undetectable
failures changes the meaning of a (and β) slightly from that of Part II;
i.e., it is no longer the "probability of calling a good system bad,"
since some of the "good" systems which are called bad through "error"
are actually bad because of undetectable failures; thus, partial test
coverage together with false alarms can lead to correct decisions.
This slight change in the meaning of a will be shown later to be very
significant.

Referring to Figure 3, the system is assumed to be assigned to
a fixed alert (or standby) period T_s, during which it is not monitored.
It is then checked out, which requires a fixed time T_c. This time
is  divided  into  two  parts,  corresponding  to  the  times  before  and  after 
test  decision.  If  the  decision  is  made  that  the  system  is  bad,  it  enters 
repair;  otherwise,  it  re-enters  standby.  Upon  completion  of  repair, 
which  requires  a  fixed  time,  the  system  re-enters  standby  without  a 
subsequent  checkout.  The  repair  period  may  include  some  checkout 
activities,  but  these  are  not  an  explicit  part  of  the  model.  If  there  were 
a  specific  provision  for  checkout  after  repair  within  the  model,  then 
the  repair  period  would  have  to  be  a  variable,  instead  of  fixed  as 
assumed  here,  since  successive  checkout  and  repair  activities  could 
occur  an  arbitrary  number  of  times  within  the  repair  period  itself. 

The  system  is  assumed  to  fail  exponentially  during  standby,  for 
both  detectable  and  undetectable  failures. 

Since a fixed standby period is assumed, the system is not operating
under a "calendar" maintenance policy, i.e., it is not possible,
in general, to predict the future time intervals to which the system
will be assigned to standby. It is also noted, from the definition of
P_AR in the list above, that when a system is in checkout, even though
it is good, it is not considered available for its mission.

The  alert  readiness  of  a  system  subject  to  the  above  conditions 
is  given  by  Eq.  (54),  p.  13^of  Reference  9,  which  is  derived  in  Reference  1 
This  equation  is  shown  in  Figure  4.  The  complexity  of  the 
equation  made  it  desirable  to  develop  a  computer  program  to  analyze 
numerically the effects of the numerous parameters on availability
and  their  interactions.  The  remainder  of  this  section  describes  the 
computer  program  and  discusses  the  results  of  the  numerical  analysis. 


*1%  '4 


!  (1  -  «  1  -  (1  -  «)pd  pd  1  [l  -  e'(X*  +  V  T»1 


1  -  \  Pd  pu  pd  u  -  “J 

u  +  Pd  Pd  U  -  n2)  -  P  -  -  Pd  Pd  U  -  «)" 

t  t  t  t 

S  8  C 

8  SC 

(V  *T T 
'  a  u' 


aR  *  '  (1  -  M  fl  -  (1  -  a)  p  p^ 

t  t 

T  +T  +  .  _  a  c  j _ 

8  °  1  +  Pd  (1  -  w2)(i  -  P  -  -  Pd  Pd  U  -  a) 

fc8  tC1  *8  *6 

Figure  4.  Alert  Readiness  of  Periodically  Checked  System 

B.  Computer  Program 

To aid in the interpretation of results, the program was written
to allow automatic plotting of the computer output in graphical form.
Some of these graphs are reproduced in Appendix B and will be discussed
shortly. All graphs plot P_AR, probability of alert readiness,
as the ordinate, and one of the twelve independent variables as abscissa.
One of the other independent variables is allowed to vary, with all
others held fixed, and can assume up to seven different values for any
one chart. Each figure in Appendix B, therefore, has at most seven
curves, each curve representing the relationship between P_AR and
one independent variable for a particular value of some other independent
variable (the parameter). The parameter changes from curve
to curve, the values being indicated at the top of each graph. Each
value is associated with a symbol used in plotting the curve. The
values are written in terms of a three-digit figure followed by a two-digit
figure, the latter representing a power of ten. For example,
0.300 00 is 0.3, 0.150 02 is 15, and 0.100 -02 is 0.001.

The values of the other parameters, which are held fixed for each
graph, are also indicated on the graphs.
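
The value notation used on the graphs can be decoded mechanically. The helper below is a small illustrative sketch in Python (the function name is hypothetical); it assumes the three-digit figure and the power of ten are separated by whitespace, as in the examples above.

```python
def decode_plot_value(label):
    """Decode a graph label such as '0.150 02' into the number it represents."""
    mantissa, exponent = label.split()
    # The second field is a signed power of ten, e.g. '02' or '-02'.
    return float(mantissa) * 10 ** int(exponent)


if __name__ == "__main__":
    for label in ("0.300 00", "0.150 02", "0.100 -02"):
        print(label, "->", decode_plot_value(label))
```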

C.  Discussion  of  Results 

1. Variable T_s, Parameter a (Figures B-1 to B-13)

When P_AR is plotted against T_s, there will always be some
T_s which maximizes P_AR (including the cases where the maximum
occurs at T_s = ∞). Basically, this value of T_s represents an
optimum trade-off between time lost from readiness due to
checkout (and repair of apparent failures induced by checkout)
and time lost from readiness due to failure in standby. However,
there are other, more complex, relationships in the trade-off
picture: when errors are committed during checkout in deciding
whether the system is good or bad, when failures are induced
during checkout which go undetected, and when repair is faulty
so that a repaired system can enter standby in a failed condition.
Some of these complex interactions are illustrated when the
parameter a varies.


Figures B-1 through B-5 show P_AR versus T_s for different
values of a, each figure being for different values of T_c and T_r.
Figure B-1 is for T_c = 10, T_r = 1, implying that checkout
is a relatively time-consuming process and that when a failure is
found, it is repaired quickly (as might be characteristic of replacement-type
repair). Since repair is so rapid, it might be expected that
a would have little effect, since no significant time is lost by
calling a good system bad. This lack of sensitivity is clear from
the figure. Although there is slight variation with a, it is noteworthy
that the highest value of a (i.e., the highest probability of
calling a good system bad) gives the highest value of P_AR. This
will be discussed further below. In fact, later results will show
that the above remarks need to be qualified.


Figures B-2 to B-5, which have increasing ratios of T_r to
T_c, illustrate increasingly greater effects of α. Figure B-4 is
for the same values of T_c and T_r as Figure B-3, but β = 0.5
instead of 0.1. Figure B-5, in addition to having a larger value
of T_r, has a larger value of λ_u. These figures also illustrate a
"crossover" phenomenon: for small T_s, P_AR tends to increase
as α gets smaller, and for large T_s, this relation is reversed.
This can be explained (as can the fact that the highest α gave
the highest P_AR in Figure B-1) as being due to the presence of
undetected failures in the system (the parameters λ_u, p_utc, and
1 - p_1 - p_2).

Although α is called a "false alarm" probability, it
can also be considered as a type of preventive maintenance
factor. This has been previously pointed out in Reference 8.
Usually, preventive maintenance pertains to the elimination of
incipient failures, whereas α eliminates possibly existing but
undetectable failures; otherwise, the similarity is apparent.
For example, an optimum replacement policy (a form of preventive
maintenance) can be determined by setting α = 1 (and β = 0).


With this interpretation of α, Figures B-1 to B-5 are more
understandable. For sufficiently large standby periods, α → 1
(for maximum P_AR) if there are any undetectable failures possible
during standby; and for any T_s, if T_r is small and/or
undetectable failures are likely, replacement may be optimum.
(Of course, we are not considering here the added burden on
the logistics system of returning a lot of good units for repair.)


In Figure B-5, note that as α increases, the optimum T_s
first decreases, then increases. This effect is shown more
clearly in Figure B-6, for which T_r = 5 instead of 25. The
change in slope of P_AR with α as T_s increases is shown in
Figures B-7, B-8, and B-9, whose curves represent cross sections
of Figures B-3, B-4, and B-5, respectively, at fixed values of
T_s.


While it is clear from the figures (particularly Figures B-5
and B-6) and the above remarks that undetectable failures during
standby make it desirable to replace the system periodically,
the question arises as to the effect of α when there are
undetectable failures during checkout and repair. The effect of
undetectable failures during checkout (p_utc) is shown by a
comparison of Figure B-10, where p_utc = 1 (and there are no other
types of undetectable failures present either) and Figure B-11,
where p_utc = 0.6. When no undetectable failures of any type are
present, as in Figure B-10, P_AR achieves its maximum when
α = 0, as expected. If p_utc is reduced to 0.6, the main effect
is to greatly increase the optimum time between checkouts, in
order to reduce the frequency of introducing undetectable failures.
(Of course, when α = 1, P_AR becomes independent of p_utc.)
When a checkout does occur, however, the system should
automatically go into repair (α = 1). Obviously, there is a conflict
between the criteria for optimum checkout time for P_AR as related
to p_utc and as related to λ_u.


Figures B-12 and B-13 show the effect of α when undetectable
failures occur during repair. In Figure B-12, p_1 = 0.6 and
p_2 = 0.4, so that no undetectable failures occur during repair.
In Figure B-13, p_2 is reduced to 0.2, for the same p_1, which
implies that half the failures induced by repair are undetectable.
Comparison of the figures shows that this just has the effect of
reducing P_AR without strongly affecting the optimum checkout
period and without making periodic replacement desirable (α = 0
is best). The reason for this is clearly that undetectable failures
are introduced only through replacement/repair.


2. Variable T_s, Parameters β, p_1, and λ_s
(Figures B-14 to B-20)


For a fixed value of T_s, P_AR is monotonically decreasing
with increasing β, λ_s, and λ_u, and with decreasing p_1, as shown
in Figures B-14 to B-20. The optimum checkout period decreases
substantially as β, λ_s, or λ_u increases, whereas it does not
change with p_1; i.e., more frequent checkout does not help if
repair becomes less (or more) reliable. The coincidence of the P_AR
peaks for varying p_1 is seen more clearly in Figure B-17, where
T_c = 10 and T_r = 1.


Figures B-18, B-19, and B-20 are cross sections of Figures B-14,
B-15, and B-16, respectively. Figure B-18 shows that β has
much less effect for small T_s than for large T_s, since if a bad
item is not detected, it probably will be at the next checkout, and
if there is not much time between checkouts, there is a smaller
reduction in readiness. Figure B-19 illustrates the almost
linear relationship between alert readiness and repair
effectiveness and shows that the slope changes slowly with
increasing T_s. Figure B-20 describes the known fact that standby
failure rate has an increasingly large effect as time between
checkouts increases.

3. Variable T_s, Parameters λ_u, p_utc, and 1 - p_1 - p_2
(Figures B-21 to B-25)

The effects of changing undetectable failure rates are shown
in Figures B-21 to B-25. The change in P_AR for different λ_u's
and α = 0.1 is shown in Figure B-21. The values of the lower
curves could be increased significantly, however, as discussed
earlier, by increasing α. It was also previously remarked that
decreasing p_utc increases the optimum period between checkouts;
this is illustrated in Figure B-22. The probability of
undetectable failures during repair, 1 - p_1 - p_2, is varied in
Figure B-23 by keeping p_1 fixed at 0.6 and varying p_2 between 0
and 0.4. As was the case with p_1, there is no marked effect on
optimum checkout period.

Figures B-24 and B-25 are cross sections of Figures B-22
and B-23, respectively (a cross section of Figure B-21 is not
shown, as it does not differ largely from Figure B-20 for λ_s).
As with p_1, an approximately linear relationship holds between
1 - p_1 - p_2 and P_AR for a given value of T_s.

4. Variable T_s, Parameters p_dtc1 and p_dtc2
(Figures B-26 to B-31)

These parameters produce effects similar to some of those
already discussed. 1 - p_dtc1 is similar to α, and all of the
remarks concerning α apply also to 1 - p_dtc1. If p_dtc1 is small,
this means that good systems will tend to fail during checkout and
go into repair, thus removing undetectable failures. Figure B-26
graphs P_AR versus T_s for different values of p_dtc1. Note that
for T_s between 2.5 and 7.5, the bottom two curves represent the
extreme values of p_dtc1; the lowest curve is for p_dtc1 = 1, and
the next higher curve is for p_dtc1 = 0. This implies that for a
fixed T_s in this region, there is an optimum value of p_dtc1 other
than 0 or 1. This effect is shown in Figure B-27, which is a cross
section of Figure B-26 for T_s = 5. In Figure B-27, P_AR is
shown as a function of p_dtc1 for T_s = 5, for different values
of λ_u. It is seen that as λ_u increases, the optimum p_dtc1
decreases, illustrating the "preventive maintenance" character
of this parameter (similar to α) mentioned above.


When λ_u and/or T_r is increased, p_dtc1 has a stronger effect,
as with α. This is shown in Figures B-28 and B-29. Cross
sections of the curves of Figure B-28 for fixed T_s are shown in
Figure B-30. This figure shows that, if T_s is sufficiently large,
it is always best if the unit fails during checkout before the point
of test decision, assuming there is a reasonable chance of detecting
the failure.


The parameter p_dtc2 is similar to p_1, in that it directly affects
the probability of entering standby good. However, as shown in
Figure B-31, the optimum T_s increases as p_dtc2 decreases,
whereas it remained unchanged with p_1 (as shown previously
in Figures B-15 and B-17).

5. Variable T_s, Parameters T_c and T_r (Figures B-32 to B-35)

The increase in optimum standby time with increasing checkout
and repair time is shown in Figures B-32 and B-33. Cross
sections of these curves for fixed T_s, Figures B-34 and B-35,
show that as standby time increases, the duration of the checkout
and/or repair period becomes less and less significant, as expected.
This is the opposite effect of that found for the parameters λ_s
and λ_u.

6. Variable α, Parameters λ_u, p_utc, p_2, and T_r
(Figures B-36 to B-40)


This series of figures shows in more detail the changing
relationships between P_AR and α, depending on the frequency of
occurrence of undetectable failures. In Figures B-36 and B-37,
which have different values of T , the parameter is λ_u. For
the values of λ_u shown, an optimum value of α exists. For
small values or large values of λ_u, not shown in Figure B-36, it
may be best never to replace units which are not detectably
bad, or always to replace them, respectively. The next figure
(B-38) illustrates a similar behavior as p_utc varies, except
that the optimum α is almost always 0 or 1. The curves all meet
at α = 1, since the value of p_utc is immaterial if the unit is
always replaced.

An interesting series of curves results (Figure B-39) when
p_2 is varied, keeping p_1 fixed. The quantity 1 - p_1 - p_2 is the
probability of undetectable failures occurring in the
repaired/replaced unit. In the figure, p_1 is held constant at 0.6,
meaning 40 percent of the time a unit comes out of repair it will be
in a failed condition. The top curve is for p_2 = 0.4, which means
that none of the failures in repair are undetectable; and the bottom
curve is for p_2 = 0, meaning all of the failures are undetectable.
The curves in between represent the case where some of the failures
are detectable. Some of these intermediate curves exhibit a maximum
P_AR for 0 < α < 1. The curves cross at the point where α = 1 - β
(i.e., α = 0.9). This point represents the unrealistic situation
where the unit is equally likely to be judged bad (during checkout)
regardless of its actual condition. Thus, units coming out of
repair with undetectable failures experience the same treatment
at the next checkout as units with detectable failures; both have
a 90 percent probability of entering repair. For values of α
larger than 0.9, for example α = 1, it is interesting to note
that it is best to have all failures occurring during repair be
undetectable. The reason for this is that units leaving repair
with detectable failures may not be repaired at the next checkout,
as there is a probability of β that the failure will not be detected.
Units with only undetectable failures, however, will always enter
repair, if they do not fail detectably before the next test decision.


Figure B-40 shows the changing slope of P_AR with α as repair
time increases, discussed previously (also, see paragraph 8 below).

7. Variable p_dtc1, Parameters p_utc, p_2, and T_r (Figures B-41 to B-43)

These figures, when compared with Figures B-38 to B-40,
illustrate the similarity mentioned earlier between the effects of
p_dtc1 and α, the value of p_dtc1 = 0 corresponding to α = 1, and
p_dtc1 = 1 to α = 0. Below, these similarities are discussed
further.

Some types of units cannot be checked out, or it may be too
costly or time-consuming to check them out. This is the reason
the parameters λ_u, p_utc, and p_2 appear in the equation for alert
readiness. An entire missile cannot be checked out completely
until it is fired, and therefore may accumulate undetectable
failures and should be replaced periodically. The optimum
period between replacements for a unit not subject to checkout
can be investigated by setting α = 1 and β = 0. This means
that at every time interval T_s the unit is replaced regardless
of condition. For convenience, T_c and λ_s can be set equal to
0, since they are not distinguishable from T_r and λ_u in this
case. Since there is no checkout, many of the parameters are
immaterial; the only ones of interest are T_s, T_r (or T_c), λ_u
(or λ_s), and p_1. Figures B-44 to B-46 show P_AR as a function
of T_s for different values of λ_u, T_r, and p_1, respectively. The
optimum replacement time is seen to decrease significantly as
λ_u increases or T_r decreases.
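
The direction of these trends can be illustrated with a deliberately
simplified sketch. It is not the alert readiness equation of this
report (that equation is given in Figure 4 and is not reproduced
here). The toy model assumes, purely for illustration, that each cycle
consists of a replacement lasting t_repair followed by a standby
period, that the unit enters standby good with probability p_good, and
that undetectable failures arrive at a constant rate lam_u. All
function and parameter names are hypothetical.

```python
import math


def toy_readiness(t_standby, t_repair, lam_u, p_good=1.0):
    """Fraction of a replace-and-standby cycle spent good in standby.

    Toy model only: constant undetectable failure rate lam_u, no checkout.
    """
    if lam_u == 0:
        expected_good_time = t_standby
    else:
        # Integral of exp(-lam_u * t) from 0 to t_standby.
        expected_good_time = (1.0 - math.exp(-lam_u * t_standby)) / lam_u
    return p_good * expected_good_time / (t_standby + t_repair)


def best_standby(t_repair, lam_u, p_good=1.0, step=0.01, t_max=200.0):
    """Grid search for the standby period that maximizes toy_readiness."""
    n_steps = int(round(t_max / step))
    candidates = (i * step for i in range(1, n_steps + 1))
    return max(candidates,
               key=lambda t: toy_readiness(t, t_repair, lam_u, p_good))


if __name__ == "__main__":
    for t_repair, lam_u in ((1.0, 0.01), (1.0, 0.1), (0.5, 0.01)):
        t_opt = best_standby(t_repair, lam_u)
        print(f"T_r={t_repair}, lambda_u={lam_u}: optimum T_s ~ {t_opt:.2f}")
```

In this sketch the optimum standby period shortens when lam_u is raised
or t_repair is lowered, and p_good only scales the curve, so it leaves
the location of the optimum unchanged.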

In discussing α and p_dtc1 in paragraphs 6 and 7 above, it was
found that there are many cases where P_AR does not have its
maximum value at α = 0 or p_dtc1 = 1. This can be interpreted
to mean that in these cases there is an optimum replacement
period for apparently good items, and the maintenance policy
consists of concurrent optimum checkout and replacement periods.
The simplest case is when α = 1 or p_dtc1 = 0 gives the maximum
P_AR, since this means the units should always be replaced after
a standby period of T_s. Or, if the maximum occurs when α = 0
or p_dtc1 = 1, the unit should never be replaced unless the
checkout specifies it as bad. What if the maximum occurs at some
intermediate value, say α = 1/2 or p_dtc1 = 1/2? It is interesting
to speculate as to whether this type of information could be used
to arrive at an optimum maintenance policy.

We  will  assume  that  the  actual  checkout  and  test  procedures 
themselves  are  completely  specified,  so  that  the  question  of 
arriving  at  an  optimum  policy  refers  only  to  the  times  at  which 
checkout  and  replacement  occur.  With  the  checkout  procedures 
specified, there can be assumed to be an actual α, β, p_dtc1,
etc., characteristic of the test, whose values will be assumed
to be known or to have been estimated. If the actual α = 0.1,
say, and the optimum α = 0.5, P_AR should increase if, on a
random basis, more items are called bad during checkout than
would normally be the case. This method will also affect β,
however, since it is not known for sure during checkout that a
given item is really good or bad. However, β will decrease in
the above procedure, since more bad items will also be rightly
called bad. Both effects increase alert readiness. Such a
randomizing procedure could not be used, however, if the actual
α were larger than the optimum; although α could be reduced (to
0 if necessary) by calling fewer items bad, β would increase and
adversely affect alert readiness.

It should be remarked that this procedure has the effect of
changing α; but it has not been shown that a specific, desired
value of α (and a corresponding β) can be arrived at this way.

More useful and realistic than such a randomizing procedure,
however, would be a periodic replacement of "good" units which,
combined with the actual α (or p_dtc1), results in an α (or p_dtc1)
which is near optimum. For example, if actual α = 0, optimum
α = 0.5, instead of randomly replacing half the good units at
each checkout period, each unit which experiences no detectable
failures during two standby-checkout cycles could be replaced.
This would have the same effect of half the "good" units being
replaced and would result in a higher P_AR than the random method,
since the undetectable failures accumulate with time, and no unit
would be left unreplaced for more than two standby periods. If
the optimum number of standby periods before automatic
replacement turns out to be a nonintegral number between k and k + 1,
then a randomizing procedure could be used to replace part of
the items after k cycles and all the remaining items after k + 1
cycles.
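
The advantage of fixed-interval over random replacement can be checked
numerically under simplifying assumptions that are not part of the
report's model: undetectable failures arrive at a constant rate lam_u,
checkout and repair take no time, and detectable failures are ignored.
The sketch compares the long-run fraction of time a unit is free of
undetectable failures when every unit is replaced after exactly two
cycles with the case where each unit is replaced with probability 1/2
at every checkout. The last function gives one possible mixing rule
for a nonintegral optimum, chosen here so that the mean number of
cycles equals the optimum. All names are hypothetical.

```python
import math


def good_fraction_fixed(lam_u, t_standby, cycles=2):
    """Time-averaged probability of no undetectable failure when every
    unit is replaced after a fixed number of standby cycles."""
    span = cycles * t_standby
    return (1.0 - math.exp(-lam_u * span)) / (lam_u * span)


def good_fraction_random(lam_u, t_standby, replace_prob=0.5):
    """Same quantity when each unit is replaced with probability
    replace_prob at every checkout (geometric number of cycles)."""
    x = math.exp(-lam_u * t_standby)
    # Probability the unit is still good at the moment it is replaced.
    good_at_replacement = replace_prob * x / (1.0 - (1.0 - replace_prob) * x)
    mean_life = t_standby / replace_prob
    return (1.0 - good_at_replacement) / (lam_u * mean_life)


def fraction_replaced_after_k(optimum_cycles):
    """Return (k, f): replace a fraction f of units after k cycles and the
    rest after k + 1, so the mean number of cycles equals optimum_cycles."""
    k = math.floor(optimum_cycles)
    return k, (k + 1) - optimum_cycles


if __name__ == "__main__":
    lam_u, t_standby = 0.05, 10.0
    print("fixed two-cycle policy:", good_fraction_fixed(lam_u, t_standby))
    print("random 50% policy:    ", good_fraction_random(lam_u, t_standby))
    print("mixing for 2.25 cycles:", fraction_replaced_after_k(2.25))
```

Both policies replace a good unit after two cycles on average, but the
random policy leaves some units in service much longer, which lowers
the time-averaged probability of being free of undetectable failures.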

Table A-1: p_ = 1.0, a = 0.1
Table A-4: p = 0.8, q = 0.1
Table A-5: p_ = 0.8, a = 0.3
Table A-10: S » P-=P = P„ = 1.0, p =0.5
Table A-11: E=n=p=p = 1, P = 0.5 (Favorable Case for a)
Table A-13: H = P = p = 1.0., p = a = 0 (Favorable Case for
Table A-14: Comparison of Favorable a Cases
Table A-15: Comparison of Favorable E Cases
Table A-16: Estimation of All Parameters (Except pc ) and Availability


in  parentheses  were  used  in  calculating  P(G)  and  A. 

APPENDIX B


GRAPHS  SHOWING  RELATIONSHIP 
OF  AVAILABILITY  TO 

OPERATIONAL  AND  MAINTENANCE  PARAMETERS 